Most newly emerging viruses are derived from strains circulating
in zoonotic reservoirs. Coronaviruses, which had an
established potential for cross-species transmission within
domesticated animals, suddenly became relevant with the
unexpected emergence of the highly pathogenic human
SARS-CoV strain from zoonotic reservoirs in 2002. SARS-CoV
infected approximately 8000 people worldwide before public
health measures halted the epidemic. Supported by robust
time-ordered sequence variation, structural biology,
well-characterized patient pools, and biological data, the
emergence of SARS-CoV represents one of the best-studied
natural models of viral disease emergence from zoonotic
sources. This review summarizes earlier and more
recent advances in understanding the molecular and structural
characteristics, with particular emphasis on host-receptor
interactions, that drove this remarkable viral disease outbreak
in human populations.

Introduction

Coronaviruses have an established potential for cross-species
transmission that became broadly recognized with
the emergence of a novel human coronavirus, Severe
Acute Respiratory Syndrome Coronavirus (SARS-CoV),
in 2002. SARS was first identified as an atypical pneumonia
in isolated patients in Guangdong Province, China.
The disease reached epidemic proportions following key
super-spreader events associated with the introduction of a
novel respiratory virus into a globalized community.
SARS-CoV caused about 8000 infections and
800 deaths worldwide by July 2003, by which time
aggressive public health intervention strategies had contained
the epidemic in the absence of any effective therapeutics [1].
The lethality of SARS-CoV was borne
largely by the elderly, in whom mortality rates
approached 50% or more. A subsequent explosion of
coronavirus research identified SARS-CoV in several
small carnivores (palm civets and raccoon dogs) in
Chinese wet markets and SARS-like CoV in the predicted
reservoir host, horseshoe bats (genus Rhinolophus). The
vastly expanded CoV phylogeny includes two novel
human coronaviruses (NL63 and HKU1), and this work ultimately
tripled the number of full-length genome sequences
available in GenBank. SARS-CoV was shown to use a
novel host receptor, Angiotensin Converting Enzyme 2
(ACE2), for docking and entry, and the viral attachment
protein, Spike, was extensively characterized both as a
determinant of host specificity and as a therapeutic target.
More recent studies of coronaviruses have progressed
to increased surveillance and characterization of numerous
new coronaviruses circulating in bats, birds, and other
species, integrated bioinformatics and microbiological
studies, and extensive evaluations of potential
therapeutics [2].

Coronavirus phylogeny and mechanisms of 
genome diversity 

Following the SARS-CoV outbreak, a surge in global
coronavirus genome sequencing efforts vastly expanded
our insight into the CoV phylogeny and resulted in the
definition of several subclassifications (Figure 1). The
greatest contribution of new strains came from the
newly discovered bat coronaviruses (BtCoV), which may be
the source of most, if not all, mammalian CoVs identified
to date [3-9]. The high diversity of coronaviruses is
attributable to three viral traits [10]. The first characteristic
is the potentially high mutation rate associated with
RNA replication, generally estimated as 10⁻³ to 10⁻⁵.
Surprisingly, the estimated mutation rate for SARS-CoV
and other coronaviruses approached 2 × 10⁻⁶ [11-13]. In
contrast to other RNA viruses, recent data suggest that
coronaviruses encode an RNA proofreading activity
associated with the 3'-5' exonuclease activity encoded
within nsp14 [14]. It is not clear whether RNA proofreading
fidelity is altered in changing environmental
settings or during virus replication under stress-related
conditions, but such possibilities may allow for rapid virus
evolution in changing ecologic conditions [14]. Second,
recombination frequencies within the coronavirus family
have been calculated to be as high as 25% during mixed
infection, likely the result of discontinuous RNA transcription
and the presence of full-length and subgenomic
negative-strand RNAs that allow for frequent strand
switching and recombination between viral genomes and
subgenomic replication complexes [15,16]. The role of
discontinuous transcription in recombination is supported
by the higher rate of recombination toward the 3' ends of
viral genomes and by targeted RNA recombination techniques
designed to genetically manipulate the 3' end of
the genome [17]. Although poorly studied, conservation
of transcription regulatory sequence (TRS) sites across
viral species may implicate these sequences as foci or hot
spots of recombination [17]. Third, as the largest of the
RNA viruses at ~27-31 kb, coronaviruses have both
increased opportunity for change and room for modification,
clearly evidenced by the presence of numerous
unique open reading frames and protein functions
encoded toward the 3' end of the genome [10]. These
genomic characteristics allow for rapid adaptation to novel
hosts, ecological niches, and tissue tropism, and even the
generation of novel coronavirus species, as seen in the
generation of FIPV type II strains from double recombination
events between FIPV type I and CCoV [10].

Figure 1
Spike phylogeny of representative CoVs and models of SARS-CoV
emergence. (a) The Spike peptide sequence of 40 representative CoVs
demonstrates that CoVs make up three distinct groups named alpha,
beta, and gamma. These names replaced the former group 1, 2, and 3
designations, respectively. Classical subgroup clusters are marked as
2a-2d for the beta CoVs and 1a and 1b for the alpha CoVs. The tree was
generated via Maximum Likelihood using the PhyML package. Major
branch labels represent bootstraps that were greater than 70. SCoV:
SARS-CoV; BtSCoV: bat SARS-like CoV; BtCoV, ZBCoV, and ARCoV:
bat CoVs; HCoV: human; FCoV and FIPV: feline CoVs; BCoV: bovine;
IBV: avian; PHEV, TGEV, PRV, PEDV: porcine CoVs; and MHV: murine
hepatitis virus. (b) Competing models of SARS-CoV emergence. Early
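
As a rough illustration of why both the per-site mutation rate and the genome size matter, the sketch below multiplies the two to get the expected number of mutations per copied genome. It uses the approximately 2 × 10⁻⁶ coronavirus estimate and the ~27-31 kb genome range quoted in this section. Treating the rate as per nucleotide per round of copying, and modeling mutation counts as Poisson distributed, are simplifying assumptions made here for illustration; the function names are hypothetical and do not come from any cited study.

```python
import math


def expected_mutations_per_genome(rate_per_site: float, genome_length_nt: int) -> float:
    """Expected number of mutations in one copied genome (rate x length)."""
    if rate_per_site < 0 or genome_length_nt <= 0:
        raise ValueError("rate must be non-negative and genome length positive")
    return rate_per_site * genome_length_nt


def prob_at_least_one_mutation(rate_per_site: float, genome_length_nt: int) -> float:
    """P(at least one mutation), assuming Poisson-distributed counts (an assumption)."""
    mean = expected_mutations_per_genome(rate_per_site, genome_length_nt)
    return 1.0 - math.exp(-mean)


if __name__ == "__main__":
    cov_rate = 2e-6  # approximate coronavirus estimate quoted in the text
    for length_nt in (27_000, 31_000):  # ~27-31 kb genome range from the text
        mean = expected_mutations_per_genome(cov_rate, length_nt)
        p_any = prob_at_least_one_mutation(cov_rate, length_nt)
        print(f"{length_nt} nt: mean={mean:.3f} mutations/genome, P(>=1)={p_any:.3f}")
```

Under these assumptions a copied genome carries well under one mutation on average, which is consistent with the text's point that coronavirus diversity also depends on recombination and genome size rather than on point mutation alone.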

Multiple incidents of cross-species 
transmission 

Coronaviruses have a strong history of host shifting, as
evidenced by phylogenetic incongruences in the family
tree [18]. In addition to SARS-CoV, two human coronaviruses,
HCoV-OC43 and HCoV-229E, are now also
recognized as having likely emerged from animal reservoirs.
HCoV-OC43 and bovine coronavirus (BCoV), both
betacoronaviruses, have very high sequence similarity,
suggesting a recent common origin (Figure 1). Molecular
clock analysis of the Spike glycoprotein of both
species estimates that HCoV-OC43 originated from a
BCoV ancestor around 1890 [19]. Similarly, HCoV-229E
likely emerged from a bat alphacoronavirus approximately
200 years ago [20]. In an example of reverse
zoonosis, porcine epidemic diarrhea virus emerged suddenly
in the early 1980s, most likely originating from
HCoV-229E [20]. Additionally, a coronavirus isolated in
1988 from a child with acute diarrhea, HECV-4408, was
shown to be closely related to BCoV,
indicating the continued introduction of zoonotic coronaviruses
into human populations [21]. The origins of
HCoV-NL63 and HCoV-HKU1, the most recently discovered
human coronaviruses, remain under study. The
most recent example of zoonotic emergence of a human
coronavirus is SARS-CoV, which had at
least two independent emergence events from zoonotic
reservoirs, recognized in 2002 and 2003 [22]. The most
recent phylogenetic data estimate the emergence of
SARS-CoV some seven years earlier, consistent with
the identification of low sero-positive cases from archived
serum samples in 2001 in China [23].

Figure 1 (continued)
data suggested that SARS-CoV initially jumped from the zoonotic
reservoir, bats, to palm civets, followed by a second jump from civets to
humans (blue arrow). More recent phylogenetic and receptor analysis
studies suggest a direct emergence from bats to humans, with
subsequent cross-transmission between humans and civets (red arrow).
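
The dating estimates in this section (for example, an HCoV-OC43 origin around 1890) rest on molecular clock reasoning. A minimal sketch of the simplest version, a strict clock, follows: two lineages sampled at the same time accumulate differences along both branches, so the time back to their common ancestor is the pairwise distance divided by twice the substitution rate. The function name and every numeric input are hypothetical placeholders chosen only to show the arithmetic. They are not values from the cited analyses [19,20], which may well have used more elaborate models.

```python
def years_to_common_ancestor(pairwise_distance: float, rate_per_site_per_year: float) -> float:
    """Strict-clock estimate of time to the common ancestor: t = d / (2 * r)."""
    if pairwise_distance < 0:
        raise ValueError("pairwise distance must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return pairwise_distance / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the cited studies.
    distance = 0.10       # substitutions per site between two sampled sequences
    rate = 5e-4           # substitutions per site per year
    sampling_year = 2000  # year both sequences were sampled
    age = years_to_common_ancestor(distance, rate)
    print(f"estimated divergence: {age:.0f} years before sampling "
          f"(around {sampling_year - age:.0f})")
```

The factor of two reflects that both lineages diverge from the ancestor simultaneously; an estimate of this kind is only as good as the assumed rate and the assumption that the rate is constant.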

SARS-related CoVs in bats 

Following its emergence in 2003, SARS was quickly
identified as a zoonotic disease, and the identification of
the wet markets as a potential source may have assisted
epidemiological control of the disease [24]. While palm
civets, raccoon dogs, and horseshoe bats (genus
Rhinolophus) have all been identified as hosts of SARS-like
CoVs, it is suggested that only the horseshoe bats are
likely reservoir hosts. Bats are widely distributed, highly
diverse, and extremely mobile mammals with an established
role as hosts of emergent RNA viruses. Coronaviruses
are exceptionally widely distributed in bats;
recent surveillance studies have extended our recognition
of this range to Africa, Europe, South America, and North
America [4,9,25-27]. The genetic variation encoded
within many recently discovered coronaviruses hosted
by bats is far greater than the diversity noted between
many human coronaviruses, despite a proportionally small
sampling of the ~1200 bat species, leading some
researchers to speculate that all mammalian coronaviruses
are derived from bat reservoir strains [4,28]. The extensive
sequence diversity provides considerable opportunity
for the emergence of new animal and human
coronaviruses that would be sufficiently antigenically
distinct as to be unaffected by preexisting exposure
and memory immune responses to established human
CoVs. For example, little antigenic cross-reactivity exists
between the S glycoproteins of more distantly related
group 2b bat coronaviruses and SARS-CoV [29]. From
a historical perspective, the next emergence event is likely
dependent only on ecological and epidemiological situations
and time, as the viral potential is well established
[30,31].

Repeated efforts have been made in recent years to
identify the zoonotic reservoir and path of emergence
for SARS-CoV, both by sampling zoonotic populations
and by attempting to clarify SARS-CoV receptor usage in
alternate hosts. A recent study, attempting to address the
paucity of bat SARS-related coronavirus sequences, gathered
and analyzed SARS-related Rhinolophus bat coronavirus
(SARSr-Rh-BatCoV; Rp3) genomes from
horseshoe bats in China [32*]. Interestingly, several bats
sampled were coinfected with HKU2, an alphacoronavirus,
providing direct evidence that individual bats can
host divergent coronaviruses, even across groups. Further,
tagging and clinical assessment of infected bats over a
four-year period showed only minor weight loss associated
with Rp3 infection, with viral clearance occurring
between two weeks and four months. Analysis of the ten
novel genomes gathered in this study, combined with
previously published sequences, demonstrated evidence
of frequent recombination between the strains. The authors
also noted a 26-bp deletion in ORF8 near, but not identical to,
the 29-bp deletion seen in human SARS-CoV epidemic
strains, suggesting that ORF8 may undergo frequent deletions
[32*].

ACE2 is the receptor for SARS-CoV, but following the
identification of several SARS-like CoVs (SL-CoVs) in
horseshoe bats (genus Rhinolophus), the ACE2 molecule of
R. pearsonii proved incapable of serving as a receptor for
SARS-CoV [3,33]. These and other initial studies
suggested that the ancestral SARS-CoV strain in bats
used an alternate receptor and that the emergence of
SARS-CoV was dependent upon either acquisition of an
ACE2 binding region or initial utilization of an alternative
human receptor [33]. However, while human ACE2 is
genetically conserved, the bat ACE2 sequences are
highly heterogeneous, with 78-84% amino acid identity
between families [34,35]. Despite this heterogeneity, the
residues that interface with the SARS-CoV Spike receptor
binding domain (RBD) are more conserved [36]. A recent
study determined that a minimum of three substitutions in
the ACE2 of R. pearsonii (RpACE2) allowed this protein
to serve as a receptor for SARS-CoV [37]. Looking more
broadly at the ACE2 molecules from seven bat species,
the ACE2 proteins from Myotis daubentoni and Rhinolophus
sinicus are capable of supporting Spike-mediated pseudovirus
and SARS-CoV infection, though less efficiently
than human ACE2 [34]. Assessment of receptor usage by
early-phase and civet isolate Spike proteins might better
inform our understanding of emergence pathways, determining
whether SL-CoV jumped directly from bat to human
hosts or whether civets or other intermediate hosts were
required before human adaptation.
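
The 78-84% figure quoted in this paragraph is a pairwise amino acid identity. As a minimal sketch of how such a number is obtained from two sequences that have already been aligned, the function below counts matching positions and skips any column containing a gap. The example sequences are short made-up strings, not real ACE2 fragments, and ignoring gapped columns is only one of several conventions; the cited studies may have used a different one.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns where both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Made-up toy alignment (hypothetical), one mismatch in ten columns.
    seq_a = "ACDEFGHIKL"
    seq_b = "ACDEFGHIKV"
    print(f"identity: {percent_identity(seq_a, seq_b):.1f}%")
```

Restricting the comparison to a chosen subset of columns, such as the residues that contact the Spike RBD, would give the interface-specific conservation the text contrasts with the whole-protein figure.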

Although original data suggested a bat to civet to human
origin, evidence supporting direct bat to human transmission
of SL-CoV emerged from recent phylogenetic
studies, in addition to the receptor studies mentioned
above (Figure 1b). Initially, a reanalysis of published
genome sequences developed phylogenies using outgroups
that were non-SARS-CoV sequences, designed
to test the monophyly of the SARS-CoV sequences
[38]. Under this assessment, bat isolates are ancestral
to all SARS-CoVs, while civet and raccoon dog
sequences (small carnivores), as well as pig isolates, cluster
within the human SARS-CoV sequences. The small
carnivore CoVs are consistently shown to be terminal
branches with human CoVs intermediate, with later
transmission of CoV between carnivores and humans
responsible for isolated cases such as GD03, a late-phase
human isolate that phylogenetically clusters with civet
rather than human sequences [38]. This phylogeny therefore
supports a direct bat to human transmission, with
subsequent and bidirectional transmissions between civets,
raccoon dogs, and humans.

A more recent study analyzed CoV sequences gathered
from 24 R. sinicus bats in geographically distant regions of
China, characterizing two distinct genotypes, Rs672 and
Rs806 [39*]. Interestingly, one sequence (Rs672) and the
previously published Rp3 form a monophyletic group more
closely related to human SCoV than to other bat SARS-like
CoV strains, based on the strong similarity of the Rs672
ORF1a/b region to human SARS-CoV sequences. This study
also provided further evidence of recombination between
Bat-SL-CoVs, with a recombination breakpoint identified
immediately after the start codon of Spike, identical to
the recombination position in the Rp3 genome [39*]. The
combination of highly diverse BtCoV species and divergent
ACE2 molecules among bat hosts suggests that direct bat
to human transmission may be feasible. Thus, the field is
left with two potentially competing models for the origins
of the SARS-CoV epidemic: first, transmission from bats
to an intermediate amplifying reservoir in small carnivores,
with subsequent transmission to humans; or second,
direct bat to human transmission followed by cross-transmission
between humans and civets and raccoon dogs
(Figure 1b). In both models, civets and raccoon dogs serve
as key amplifying hosts for virus persistence and reintroduction
into human populations.

Figure 2
Sequence changes over the SARS-CoV epidemic. Shown here are the most significant changes important for the transition of SARS-CoV from civet to early,
middle, and late phases of epidemic strains. Mutations indicative of lineages that were not likely to have contributed to the expansion to other phases
have been removed. All other positions in the genome are identical.

Genesis of an epidemic 

A chronological set of SARS-CoV sequence changes
spanning the SARS outbreak provided an unparalleled
opportunity to identify the genetic basis for zoonotic virus
cross-species transmission and human adaptation during
an expanding epidemic. Molecular changes noted at
the end of the early phase and expansion into the middle
phase of the epidemic include A3047V and A3072V in the
replicase, and D778Y and perhaps an E-to-K substitution in
the Spike gene. Transition from the middle to the late phase
of the epidemic included A2552V in ORF1a, E1389D in ORF1b,
and D77G and T244I in the S gene (Figure 2)
[40]. It has been hypothesized that these alterations were
key to an expanding epidemic, yet empirical data
supporting these claims and the functional significance of these
alterations remain unavailable. For example, it is not clear
whether the 29-bp deletion in ORF8 is central to human
adaptation as suggested, or is a genetic hitchhiker amplified
and maintained following a selective sweep mediated by
other beneficial mutations located elsewhere in the genome
[3,40]. In addition to these changes, the SARS-CoV
Spike glycoprotein was under strong positive selection,
with 23 substitutions evolving during the expanding
phases of the epidemic [41]. Experimental evidence
suggests that both adaptation to ACE2 and antibody selection
contributed to Spike changes [40,42].

Coronavirus cross-species transmission: role 
of Spike-receptor interactions in viral entry 

Coronavirus-receptor interactions are key determinants
regulating host range, cross-species transmission, and
tissue tropism. The various coronaviruses demonstrate
broad receptor and coreceptor usage, from proteases such
as aminopeptidase N (APN) for transmissible gastroenteritis
virus (TGEV), canine CoV, feline infectious peritonitis
virus (FIPV), and HCoV-229E, to cell adhesion molecules
such as CEACAM1a for MHV, to sugars as coreceptors for
some alpha-, beta-, and gammacoronaviruses [36,43,44].
This diverse receptor usage directly impacts host range
and tissue tropism, as demonstrated by the closely related
PRCoV and TGEV. PRCoV lacks the sugar-binding
region of TGEV and consequently is limited to a respiratory
rather than enteric tropism [45]. The recently
solved crystal structure of the group 2a coronavirus MHV
complexed with its receptor, murine CEACAM1a,
emphasizes again the broad diversity and flexibility of
CoV Spike glycoproteins. The core structure of the MHV
RBD is hypothesized to have been derived from a host
sugar-binding protein (galectin) and subsequently modified
to allow mCEACAM1a binding, thus enhancing
MHV affinity for host cells [43]. Other coronaviruses
encode a second putative viral attachment protein, the
hemagglutinin esterase (HE), which was likely derived
from influenza C strains by recombination
[46]. Coronaviruses selected in vitro to broaden host range
often mutate to bind heparan sulfate for docking and
entry [47]. It is notable that OC43 and BCoV have
carbohydrate (sialic acid) binding capacities, as well as
broader host ranges [48]. The capacity to bind carbohydrates
for docking and entry may provide an additional
pathway for coronavirus host range expansion, cross-species
transmission, and disease emergence, and
requires further study.

The key determinant of SARS coronavirus host specificity
is the Spike glycoprotein, an envelope-anchored
trimeric protein responsible for binding human ACE2 as
the principal receptor for virus docking and entry. The
SARS-CoV S glycoprotein also binds C-type lectins such as
DC-SIGN and/or L-SIGN as coreceptors, an interaction that
is blocked by mannose-binding lectin [49,50]. Importantly,
SARS-CoV docking and entry are also highly dependent
upon cleavage of S and ACE2 by transmembrane
protease/serine subfamily member 2 (TMPRSS2), especially
in airway and alveolar sites, and upon cathepsin L cleavage
and subsequent S2 fusion activation [51-53]. Several studies
in the past two years have worked to clarify the plasticity
of this protein, with particular emphasis on the RBD.
The Spike glycoprotein underwent rapid evolution
during the human epidemic [40], was the most significantly
variable protein across civet and human isolates
[22], and shows evidence of positive selection during
both interspecies and intraspecies transmission events
[10,22,40,54]. The SARS-CoV Spike can recognize and use
bat, civet, mouse, and raccoon dog ACE2 receptor molecules
for docking and entry, indicating that SARS-CoV trafficked
along receptor ortholog networks to move
between species [34,55,56]. As several alphacoronaviruses
also use APN from different species, these data
suggest a common theme in coronavirus host range
switching: recognizing receptor orthologs from different
species [36]. Additionally, the role of orthologous
proteases in facilitating coronavirus S glycoprotein cleavage
and entry processes remains undefined, and could
significantly contribute to the efficiency of virus cross-species
transmission.

SARS-CoV replicates but does not produce clinical disease
in mice. Two experimental adaptations of SARS-CoV
to murine hosts by serial passage independently
identified a substitution in the Spike gene at residue
436, which alone has been shown to enhance infectivity
and pathogenesis in mice and is predicted to allow
stronger binding to the murine ACE2 receptor
[29,57,58*]. However, substitutions outside of Spike are
necessary for the full lethal disease phenotype in MA15,
and presumably also in v2163 [57]. For example, two
other proteins, nsp9 and nsp13, contained mutations in
both mouse-adapted strains, MA15 and v2163. Additionally,
single substitutions in the M gene are common to
MA15 and to virus adapted to persistent infection of human
tubular kidney cells, suggesting that the M protein influences
tropism or pathogenesis by facilitating the efficiency of
particle egress [59]. The substitutions common to both
mouse-adapted strains suggest potential SARS-CoV virulence
factors in the later stages of adaptation to a novel host,
and indicate potential mutation-driven emergence pathways.
The mouse-adapted viruses may not represent true
cross-species transmission events, as SARS-CoV could already
replicate in the mouse lung, but it is notable that the most
conserved change in both mouse-adapted strains
enhances receptor binding at the same Spike residue.
Further, serological studies indicate multiple cross-species
transmissions into humans in the years before
the epidemic, suggesting that the virulence factors contributing
to the later stages of adaptation to novel hosts, in
Spike or elsewhere, are critically important [23].

The RBD (aa318-510) is the strongest determinant of
host range for SARS-CoV and other coronaviruses [29].
Single substitutions within the RBD can significantly
affect the binding affinity of Spike to its receptor [60].
Indeed, as few as one or two substitutions in the RBD are
sufficient to allow the virus to alter host receptor specificity
[61]. Experimental adaptation of a civet-Spiked SARS
virus to the human ACE2 receptor by Sheahan et al.
demonstrated the minimal requirements for host range
expansion. In these studies, a civet-Spiked SARS-CoV
was incapable of propagating in Vero cells until a human-tropic
substitution was introduced at residue 479. When
the civet-Spiked virus included the K479T substitution, it
was capable of propagating on Vero cells and of
replicating on human airway epithelial (HAE) cells
and hACE2-expressing DBT cells, demonstrating
that single substitutions are capable of expanding the
virus host range. Interestingly, when the K479T civet-SARS
virus was experimentally selected for enhanced replication
on HAE cells, the substitutions
that improved replication did not exactly reproduce the
substitutions seen in the epidemic strains. Rather, an
initial substitution at 479 was necessary for the civet-SARS
virus to use primate ACE2 and propagate in Vero cells,
but passage on HAE cells selected for adaptive substitutions
at two different contact interface sites, residues 442 and
472, rather than the 487 site identified in the epidemic
strain [61,62]. This suggests that multiple genetic pathways
exist that can improve S RBD-human ACE2
receptor interactions, providing the virus with multiple
strategies to adapt to new host species [56]. It is interesting
to note that this alternative pathway for recognizing
hACE2 ablated interactions with the civet ACE2 (cACE2)
receptor, supporting the hypothesis that epidemic SARS-CoV
strains were coselected to efficiently recognize both civet
and human ACE2 receptors.
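
Substitution labels such as K479T follow the pattern of wild-type residue, position, replacement residue. The sketch below parses that notation, checks whether a position falls inside the RBD range given in this section (aa318-510), and applies a substitution to a sequence. The helper names and the three-residue toy sequence are hypothetical and serve only as illustration; positions 442, 472, and 487 are taken from the text, but no residue identities are assumed for them.

```python
import re
from typing import NamedTuple

# RBD boundaries as stated in the text (aa318-510).
RBD_START = 318
RBD_END = 510


class Substitution(NamedTuple):
    wild_type: str
    position: int
    mutant: str


_LABEL = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(label: str) -> Substitution:
    """Parse a label like 'K479T' into (wild type, position, mutant)."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def in_rbd(position: int) -> bool:
    """True if the residue number lies within the RBD range from the text."""
    return RBD_START <= position <= RBD_END


def apply_substitution(sequence: str, sub: Substitution, first_residue: int = 1) -> str:
    """Return the sequence with the substitution applied, checking the wild-type residue."""
    index = sub.position - first_residue
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the supplied sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError("wild-type residue does not match the sequence")
    return sequence[:index] + sub.mutant + sequence[index + 1:]


if __name__ == "__main__":
    sub = parse_substitution("K479T")
    print(sub, "in RBD:", in_rbd(sub.position))
    for pos in (442, 472, 487):  # contact-site positions named in the text
        print(pos, "in RBD:", in_rbd(pos))
    # Toy fragment whose first residue is numbered 478 (hypothetical).
    print(apply_substitution("AKC", sub, first_residue=478))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when a fragment does not start at residue 1.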

Antibodies that neutralize SARS-CoV predominantly
bind to the RBD of Spike. Rockx et al. selected and
sequenced a number of different escape mutants to a
panel of 23 human monoclonal antibodies, the majority of
which contained single substitutions along the RBD
interface with ACE2 [63**]. All but one escape site
mapped within 4 angstroms of contact interface residues,
and yet all viruses grew to comparable peak titers in Vero
and hACE2-restricted DBT cells. However, growth on
civet-ACE2-restricted DBT cells was restricted for all
escape viruses, suggesting that escape from antibody
neutralization can alter Spike-receptor binding and,
consequently, host range [63**]. That antibody escape
variants can stably adopt substitutions in the Spike-ACE2
receptor interface suggests that the host response
to an infection may select for host range variants by a
mutation-driven mechanism.

Extensive structural modeling tools are available to predict
receptor binding, antibody neutralization, or the
stability of substitutions within the RBD of the SCoV
Spike. Three coronavirus Spike RBD structures have been
solved in complex with their receptors to date, allowing for
prediction and validation of the structural determinants of
binding to host and orthologous receptors (Figure 3).
Application of mathematical modeling to Spike-receptor and
Spike-antibody structural models allowed for the prediction
of escape substitutions with a high probability of fixation
in a viral population [64]. These predictions are partially
in accordance with published data, predicting that selection
with antibody 80R would favor a substitution at D480
of Spike, as seen in vitro following SARS-CoV escape
from 80R neutralization [64,65].

Plasticity of the Spike glycoprotein 

The coronavirus Spike glycoprotein is remarkably plastic,
capable of accommodating mutations and deletions of up to
479 (MHV) or 681 nucleotides (PRCoV) while retaining
receptor binding and entry functions [66-68]. To date,
large deletions in the SARS-CoV S glycoprotein have not
been reported. The S protein is divided into discrete
domains: an N-terminal domain, RBD, two heptad
repeats, a transmembrane anchor, and an intracellular tail
[43]. Discrete regions can be exchanged between strains
while preserving both protein function and antibody
binding [29,36]. Multiple coincident substitutions as well
as contact interface site substitutions can be tolerated to
allow escape from antibody neutralization while maintaining
receptor specificity [42,60,69,70]. This flexibility
allows for multiple genetic pathways from the use of
zoonotic receptors to the human ACE2 receptor [56].

Figure 3
Crystal structures of coronavirus receptor binding domains (RBDs) complexed with their receptors. To date, the crystal structures of three coronavirus
Spike RBD-receptor complexes have been solved: (a) the RBD of SARS-CoV complexed with human ACE2 (PDB 2AJF) [73], (b) NL63 complexed with
human ACE2 (3KBH) [71**], and (c) MHV complexed with murine CEACAM1a (3R4D) [43].

Diversity and flexibility of the Spike glycoprotein are
characteristic of coronaviruses beyond SARS-CoV. The lack of a
clear ACE2 receptor binding motif (RBM) in the horseshoe
Bat-SL-CoV Spike, and its inability to use hACE2 as a
receptor, led to an early hypothesis that the human SCoV
emerged from Bat-SL-CoV following a recombination
event, perhaps with an NL63-like CoV, as NL63 also uses
ACE2 as a receptor. Such a recombination event would
have allowed direct acquisition of an ACE2 binding motif
and the resulting cross-species transmission [35]. Alternatively,
SARS-CoV may have used bat ACE2 for docking and entry,
with introduction into human/civet populations selecting for
mutations that enhanced interaction with the civet or
human ACE2 receptor. The recently published crystal
structure of NL63-CoV complexed with the ACE2 receptor
shows no structural homology with the SARS-CoV
RBM or the core RBD (Figure 3) [71**,72]. This
suggests that convergent evolution, rather than
recombination-mediated transfer, led to the common use of ACE2
by NL63 and SARS-CoV [72].

Early data suggested that the RBDs of SARS-CoV and
perhaps HCoV-NL63 were derived by recombination
processes rather than mutation-driven evolution. While
these ideas remain highly speculative, these data
suggested that the S glycoprotein RBDs and/or fusion
cores of CoVs may be interchangeable between distant
strains. In support of this hypothesis, the consensus bat
SARS-like genome HKU3 was replication competent but
could not sustain sequential rounds of infection,
presumably because of the lack of appropriate receptors
for docking and entry. The insertion of the SARS-CoV RBD
into the HKU3 Spike allowed for the production of
progeny virus that grew to high titer in ACE2-expressing
DBT cells and was capable of replicating in human
airway epithelial cells and mouse lungs, although it grew
with greatly reduced efficiency in the latter [29]. Thus, under
certain conditions, recombination processes can result in
bat CoV host shifting. It
is notable that attempts to isolate CoV from bats have
repeatedly failed, limiting our ability to study adaptive
mechanisms or pathogenesis of CoV in host species, but
that synthetic biology provides alternative sources of
these viruses. The construction of a synthetic bat SARS-like
coronavirus provided strong evidence that the
capacity for interspecies movement of coronaviruses,
specifically SARS-like coronaviruses, resides largely in the RBD
[29]. While previous studies had indicated that small
changes in the Spike glycoprotein could alter host specificity
of coronaviruses, the sufficiency of a discrete RBD
change in the context of a divergent 30 kb genome
demonstrates that the RBD is a minimum determinant of
species tropism. Further, it suggests a potential mechanism
of host range expansion, in which both recombination
and single substitution events allow for infection of
novel hosts. Determining receptor specificities for these
novel bat coronaviruses offers considerable opportunity to
enrich our understanding of coronavirus-receptor interactions,
identify new receptors that coronaviruses use for
docking and entry, and provide novel models for studying
the ease and mechanism of cross-species transmission.

Conclusions 

Fundamental insights into the molecular mechanisms
and pathways that govern virus cross-species transmission
are central to protecting global health. Coronaviruses
readily traffic between host species, and the Spike glycoprotein
is the most extensively characterized viral determinant
of host range expansion. Binding of the
coronavirus Spike to the host receptor is the minimum
determinant of infectivity and species specificity, and
many recent studies have demonstrated the ability of the S
RBD to mutate and engage ortholog receptors or escape
antibody neutralization [61,63**]. We need to know more
about the breadth of novel coronavirus receptors that are
used in nature and the mechanisms governing ortholog
receptor recognition. Importantly, the coronavirus RBD
interface is a robust iterative model for predicting
structure-function relationships between mutation-driven
host range expansion, virus-receptor interactions, and
antibody binding and neutralization. The SARS S-RBD
model captures highly regulated variables that recapitulate
real-life biological processes critical for coronavirus
cross-species transmission and host immune response
(Figure 4). The SARS RBD-receptor-neutralizing antibody
interface provides considerable opportunity for predicting
and studying the role of mutations in cross-species
transmission and immunity. In addition, recent work has
also expanded our appreciation of how intragenic recombination
may influence coronavirus host range, as evidenced
by targeted recombination, recombination
between different bat coronaviruses, and identification
of the RBD as a minimum determinant of host range
expansion [29,39*]. While the precise ancestor and route
of emergence for SARS-CoV remain unidentified, extensive
sampling and phylogenetic studies of bat CoVs have
raised the possibility that the epidemic strain may have
jumped directly to humans before jumping to civets.
Thus, future coronavirus epidemics may be more frequent
than would be expected under a two-step
emergence model that requires an intermediate host.
Additionally, while it remains unclear whether recombination
and/or mutation of Spike mediated the emergence
of SARS-CoV, both mechanisms can readily impact coronavirus
host range. Future studies are needed to clarify the
potential roles of host proteases or antibody-mediated
selection in cross-species transmission, and to clarify
whether modulation of RNA proofreading activity could
impact viral adaptation to a novel host. Further, structural
and mathematical modeling tools offer novel predictive
capabilities that, when integrated with experimental studies,
will assist in predicting the ease of cross-species
transmission and the mechanisms of emergence, and will
contribute to improvements in therapeutic design.

Figure 4
Experimental evolution at the SARS S glycoprotein RBD-ligand interface. The SARS RBD is heterogeneous and includes defined sequence variation at
specific residues that engage the ACE2 receptor from different species (Parts 1 and 2). Bioinformatics can be used to predict and then test the impact
of targeted mutations on variant virus-receptor interactions. Iterative rounds of mutation-driven selection are also possible using recombinant viruses
encoding targeted mutations and variant ACE2 receptors for docking and entry. The model allows a deep structural understanding of the potential
pathways and molecular mechanisms that govern cross-species transmission and pathogenesis. The biological impact of host shifting on antigenicity
can be predicted using structural models of antibody-RBD interfaces, and then studied using a panel of well-characterized human and mouse
monoclonal antibodies targeting the different SARS-CoV RBD domains (Part 3). In parallel, neutralizing monoclonal antibodies can be used to select for
escape mutations (Part 4), allowing for iterative rounds of prediction and testing of how these mutations impact host range and ACE2 recognition.